Language emergence and evolution have recently gained growing attention through
multi-agent models and mathematical frameworks for studying their behavior. Here we
investigate further the Naming Game, a model able to account for the emergence of a
shared vocabulary of form-meaning associations through social/cultural learning. Due
to the simplicity of both the structure of the agents and their interaction rules, the
dynamics of this model can be analyzed in great detail using numerical simulations and
analytical arguments. This paper first reviews some existing results and then presents
a new overall understanding.

Keywords: Cultural evolution; Language self-organization; Social interaction; Emergence 
of consensus; Statistical physics 



1. Introduction 

Language is based on a set of cultural conventions socially shared by a group. But
how are these conventions established without a central coordinator and without
telepathy? The problem has been addressed by several disciplines, but it is only in
the last decade that there has been a growing effort to tackle it scientifically using
multi-agent models and mathematical approaches (see the cited reviews). Initially
these models focused on the emergence of a shared vocabulary, but increasingly
attempts are made to tackle grammar [4, 5, 6, 7].

The proposed models can be classified as defending a sociobiological or a
sociocultural explanation. The sociobiological approach, which includes the
evolutionary language game, is based on the assumption that successful communicators,
enjoying a selective advantage, are more likely to reproduce than worse
communicators. If communication strategies are innate, then more successful strategies
will displace rivals. The term strategy acquires its precise meaning in the context of
a particular model. For instance, it can be a strategy for acquiring the lexicon of a
language, i.e., a function from samplings of observed behaviors to acquired
communicative behavior patterns [8, 9, 10]; or it can simply coincide with the lexicon
of the parents, or with some strong disposition to acquire a particular kind of syntax,
usually called innate Universal Grammar.

In this paper we discuss a model, first proposed in [21], that belongs to the
sociocultural family [13, 14, 15]. Here, good strategies do not necessarily provide
higher reproductive success, but only higher communicative success and greater
expressive power, and hence greater success in reaching cooperative goals with less
effort. Agents select better strategies exploiting cultural choices, feedback from
communication, and a sense of effort. Agents have the ability not only to acquire an
existing system but also to expand their rules to deal with new communicative
challenges and to adjust their rules based on observing the behavior of others. Global
coordination emerges over cultural timescales, and language is seen as an evolving and
self-organized system. While the sociobiological approach emphasizes language
transmission following a vertical, genetic or generational line, the sociocultural
approach emphasizes peer-to-peer interactions.

A second, fundamental distinction among the different models concerns the
adopted mechanisms of social learning, describing how stable dispositions are
exchanged and coordinated between individuals. The two main approaches are
the so-called observational learning model and the reinforcement model. In the
first approach, observation is the main ingredient of learning, and statistical
sampling of observed behaviors determines their acquisition [8, 9, 10]. The second
emphasizes the functional and inferential nature of conventional communication,
the scaffolding role of the speaker, the restrictive power of the joint attention frame
set up in the shared context, and the importance of pragmatic feedback in language
interaction. Here we adopt the reinforcement learning approach, as in [13, 14, 15].

In this paper we shall discuss a recently introduced model inspired by one
of the first language game models, known as the Naming Game. It is able to
account for the emergence of a shared set of conventions in a population of agents.
Central control and co-ordination are absent, and agents perform only pairwise
interactions following straightforward rules. Indeed, due to the simplicity of the
interaction scheme, the dynamics of the model can be studied both with massive
simulations and analytical approaches. By doing so we import a pre-existing model
into the statistical mechanics context (as opposed to the reverse, which is often the
case).

In past work, sociocultural investigations largely focused on computational
issues and on applications to emergent communication in software agents or physical
robots [20], resulting in a lack of quantitative investigations. For instance, we shall
discuss in detail later how the main features of the process leading the population
to a final convergence state scale with the population size, whereas earlier work has
concentrated on studying very small populations. The price to pay for quantitative
comprehension is a reduction in the number of aspects of the phenomena we
can treat. Thus, the agent architectures we shall describe are indeed very basic and
stylized, and are much too simple compared to the cognitive mechanisms humans
employ, but on the other hand they allow us to study much more clearly what is
crucial to obtain the desired global co-ordination based on only local interaction.
The present paper shows that the crucial features are in fact simple, and we consider
this to be one of our major contributions. Although we simplified the original Naming
Game, we retained its most important properties, so that the interaction
scheme could still be ported to real world robots or be used to explain the behavior
of biological agents.

The paper is organized as follows. In Sec. 2 we present the Naming Game
model and discuss its basic phenomenology. Sec. 3 is devoted to the study of the
role of population size. We investigate the scaling relations of some important
quantities and provide analytical arguments to derive the relevant exponents. In Sec. 4
we look in more detail at the mechanisms that give rise to convergence, deepening
the analysis presented in [21]. In particular, we identify and explain the presence
of a hidden timescale that governs the transition to the final consensus state. In
Sec. 5 we focus on the relation between single simulation runs and averaged
quantities, while in Sec. 6 we investigate the properties of the consensus word. We then
analyze, in Sec. 7, a controlled case that sheds light on the nature of the symmetry
breaking process leading to lexical convergence. Finally, in Sec. 8 we discuss
the most relevant features of the model and present some conclusions, concerning
in particular its connections with the fields of Opinion Dynamics on the one hand and
Artificial Intelligence on the other.

2. The model 

2.1. Naming Game 

We present here the version of the Naming Game introduced in [21] (see also [22] for
a comprehensive analysis of the model). The game is played by a population of N
agents in pairwise interactions. As a side effect of a game, agents negotiate
conventions, i.e., associations between forms (names) and meanings (for example
individuals in the world), and it is obviously desirable that a global consensus emerges.
Because different agents can each independently invent a different name for the
same meaning, synonymy (one meaning, many words) is unavoidable. However, we
do not consider here the possibility of homonymy (one word, many meanings). In the
invention process, in fact, we consider the situation where the number of possibly
invented words is so huge that the probability that two players will ever invent the
same word at two different times for two different meanings is practically negligible.
This means that the dynamics of the inventories associated with different meanings
are completely independent, and the number of meanings becomes a trivial
parameter of the model. As a consequence we can, without loss of generality, reduce the
environment to one single meaning and focus on how a population can
establish a convention for expressing that meaning. In a generalized Naming Game,
homonymy is not always an unstable feature, and its survival depends in general on
the size of the meaning and signal spaces. Homonymy becomes crucial if, during
a conversation, agents do not get precise feedback about the meaning. If there is
more than one possible meaning compatible with the current situation (for example
if the word expresses a category but we do not know which one), then homonymy
is unavoidable. This is not the case for the Naming Game, while it becomes
crucial for the so-called Guessing and Category Games.

The model definition can be summarized as follows. We consider an environment
composed of one single object to be named, the extension to many different objects
being trivial if one neglects homonymy. Each individual is described by its inventory,
i.e., a set of form-meaning pairs (in this case only names competing to name the
unique object), which is empty at the beginning of the game (t = 0) and evolves
dynamically in time. At each time step (t = 1, 2, ...) two agents are randomly selected
and interact: one of them plays the role of speaker, the other one that of hearer.
The interactions obey the following rules (Fig. 1):

• The speaker transmits a name to the hearer. If its inventory is empty, the
speaker invents a new name; otherwise it selects randomly one of the names
it knows;

• If the hearer has the uttered name in its inventory, the game is a success,
and both agents delete all their names except the winning one;

• If the hearer does not know the uttered name, the game is a failure, and
the hearer inserts the name in its inventory.
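
The three rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function name `naming_game`, the use of fresh integers as invented names, and the `seed` and `max_steps` parameters are all hypothetical choices made here for the example.

```python
import random
from itertools import count


def naming_game(n_agents, max_steps, seed=0):
    """Run one Naming Game; return (steps_used, inventories).

    Assumption: invented names are modeled as fresh integers, so two
    agents can never invent the same name (no homonymy).
    """
    rng = random.Random(seed)
    new_name = count()
    inventories = [set() for _ in range(n_agents)]  # all empty at t = 0

    for step in range(1, max_steps + 1):
        # Homogeneous mixing: pick two distinct agents uniformly at random.
        speaker, hearer = rng.sample(range(n_agents), 2)

        # Rule 1: invent a name if the inventory is empty, then pick one at random.
        if not inventories[speaker]:
            inventories[speaker].add(next(new_name))
        name = rng.choice(sorted(inventories[speaker]))

        if name in inventories[hearer]:
            # Rule 2 (success): both agents keep only the winning name.
            inventories[speaker] = {name}
            inventories[hearer] = {name}
            # Consensus can only be reached right after a success.
            if all(inv == {name} for inv in inventories):
                return step, inventories
        else:
            # Rule 3 (failure): the hearer learns the name.
            inventories[hearer].add(name)

    return max_steps, inventories
```

Calling `naming_game(20, 200000)` returns the number of interactions played and the final inventories; at consensus every inventory holds the same single name.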

Another important assumption of the model is that two agents are randomly
selected at each time step. This means that each agent in principle can talk to
anybody else, i.e., that the population is completely unstructured (homogeneous mixing
assumption). The role of different agent topologies has been discussed extensively
elsewhere [24, 25, 26, 27, 28, 29, 22]. A generalized model of the Naming Game has also
been proposed, in which agents do not update their inventories deterministically
after a success, but rather do so with a certain probability. Generalized
models exhibit interesting phenomenologies, including a non-equilibrium phase
transition, but we do not consider them here.
Fig. 1. Naming game interaction rules. The speaker selects randomly one of its names,
or invents a new name if its inventory is empty (i.e., we are at the beginning of the game). 
If the hearer does not know the uttered name, it simply adds it to its inventory, and 
the interaction is a failure. If, on the other hand, the hearer recognizes the name, the 
interaction is a success, and both agents delete from their inventories all their names but 
the winning one. 



Finally, it is worth stressing that the random selection rule adopted by the
speaker to select the word to be transmitted, and the absence of weights associated
with words, expressly violate the fundamental ingredients of earlier models.
Indeed, as we are going to show, they turn out to be unnecessary.

2.2. Basic phenomenology 

The most basic quantities describing the state of the population at a given time $t$
are: the total number of names present in the system, $N_w(t)$; the number of different
names known by agents, $N_d(t)$; and the success rate $S(t)$, i.e., the probability of
observing a successful interaction at a given time. In Figure 2 we report data concerning
a population of $N = 10^3$ agents. The process starts with a trivial transient in
which agents invent new names. There follows a longer period of time in which the $N/2$
(on average) different names are exchanged after unsuccessful interactions. The
probability of a success taking place at this time is indeed very small ($S(t) \simeq 0$), since
each agent knows only a few different names. As a consequence, the total number
of names grows, while the number of different names remains constant. However,
agents keep correlating their inventories, so that at a certain point the probability of
a successful interaction ceases to be negligible. As fruitful interactions become more
Fig. 2. Basic global quantities. a) Total number of names present in the system, $N_w(t)$;
b) Number of different names, $N_d(t)$; c) Success rate $S(t)$, i.e., probability of observing a
successful interaction at a time $t$. The inset shows the linear behavior of $S(t)$ at small times.
All curves concern a population of $N = 10^3$ agents. The system reaches the final absorbing
state, described by $N_w(t) = N$, $N_d(t) = 1$ and $S(t) = 1$, in which a global agreement on
the form (name) to assign to the meaning (individual object) has been reached.



frequent, the total number of names at first slows its growth and then starts to
decrease, so that the $N_w(t)$ curve presents a well identified peak. Moreover, after a
while, some names start disappearing from the system. The process evolves with an
abrupt increase in the success rate, with a curve $S(t)$ which exhibits a characteristic
S-shaped behavior, and a further reduction in the numbers of both total and different
names. Finally, the dynamics ends when all agents have the same unique name and
the system is in the desired convergence state. It is worth noting that the developed
communication system is not only effective (each agent understands all the others),
but also efficient (no memory is wasted in the final state).
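
As a small illustration of the two counting observables defined in this section, the sketch below computes the total number of names and the number of different names from a list of inventories. The function name `global_quantities` and the representation of inventories as Python sets are assumptions made for this example.

```python
def global_quantities(inventories):
    """Return (total_names, different_names) for a list of set inventories.

    total_names corresponds to N_w (sum of all inventory sizes);
    different_names corresponds to N_d (size of the union of inventories).
    """
    total_names = sum(len(inv) for inv in inventories)
    different_names = len(set().union(*inventories))
    return total_names, different_names
```

In the final absorbing state of a population of N agents this returns `(N, 1)`, matching the description of the convergence state given above.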

From the inset of Figure 2 it is also clear that the $S(t)$ curve exhibits a linear
behavior at the beginning of the process: $S(t) \sim t/N^2$. This can be understood
by noting that, at early stages, most successful interactions involve agents which have
already met in previous games. Thus the probability of success is proportional to the
ratio between the number of couples that have interacted before time $t$, whose order
is $O(t)$, and the total number of possible pairs, $N(N-1)/2$. The linear growth ends
in correspondence with the peak of the $N_w$ curve, where it holds $S(t) \sim 1/N^{0.5}$,
and the success rate curve exhibits a bending afterward, slowing down its growth
till a sudden burst that corresponds to convergence.
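
The counting argument can be written out explicitly. In the sketch below the constant $c$ (absorbing the proportionality factor and the $O(t)$ prefactor) is a notational assumption, and the value at the peak uses the scaling $t_{max} \sim N^{3/2}$ of the peak time.

```latex
S(t) \;\simeq\; \frac{c\,t}{N(N-1)/2} \;\approx\; \frac{2c\,t}{N^{2}}
\quad (N \gg 1),
\qquad
S(t_{max}) \;\sim\; \frac{N^{3/2}}{N^{2}} \;=\; N^{-1/2}.
```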

3. The role of system size 
3.1. Scaling relations 

A crucial question concerns the role played by the system size $N$. In particular, two
fundamental aspects depend on $N$. The first is the time needed by the population
to reach the final state, which we shall call the convergence time $t_{conv}$. The second
concerns the cognitive effort, in terms of memory, required by each agent along
the dynamics. This reaches its maximum at the peak of the $N_w(t)$
curve. Figure 3 shows the scaling of the convergence time $t_{conv}$, and of the time and height
of the peak of $N_w(t)$, namely $t_{max}$ and $N_w^{max} = N_w(t_{max})$. The difference time
$(t_{conv} - t_{max})$ is also plotted. It turns out that all these quantities follow power law
behaviors: $t_{max} \sim N^{\alpha}$, $t_{conv} \sim N^{\beta}$, $N_w^{max} \sim N^{\gamma}$ and $t_{diff} = (t_{conv} - t_{max}) \sim N^{\delta}$,
with exponents $\alpha \approx \beta \approx \gamma \approx \delta \approx 1.5$.
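
Exponents of this kind are commonly estimated by a straight-line fit in log-log scale. The sketch below is one simple way to do it with the standard library only; it is an illustration, not the fitting procedure used for Figure 3, and the function name `fit_power_law_exponent` is hypothetical.

```python
import math


def fit_power_law_exponent(sizes, values):
    """Least-squares slope of log(values) versus log(sizes).

    For data following values ~ A * sizes**k, the returned slope estimates k.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

In practice `sizes` would be the simulated population sizes and `values` the measured averages of a quantity such as the convergence time; deviations from a pure power law (for example oscillations in logarithmic scale) show up as systematic residuals around the fitted line.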

The values for $\alpha$ and $\gamma$ can be understood through simple analytical arguments.
Indeed, assume that, when the total number of words is close to the maximum, each
agent has on average $cN^{a}$ words, so that it holds $\alpha = a + 1$. If we assume also that
the distribution of different words in the inventories is uniform, the probability for
the speaker to play a given word is $1/(cN^{a})$, while the probability that the hearer
knows that word is $2cN^{a}/N$ (where $N/2$ is the number of different words present
in the system). The equation for the evolution of the number of words then reads:

$$\frac{dN_w(t)}{dt} \propto \frac{1}{cN^{a}}\left(1 - \frac{2cN^{a}}{N}\right) - \frac{1}{cN^{a}}\,\frac{2cN^{a}}{N}\,2cN^{a}, \qquad (1)$$

where the first term is related to unsuccessful interactions (which increase $N_w$ by
one unit), while the second one is related to successful ones (which decrease $N_w$ by $2cN^{a}$).
At the maximum $dN_w(t_{max})/dt = 0$, so that, in the thermodynamic limit $N \to \infty$,
the only possible value for the exponent is $a = 1/2$, which implies $\alpha = 3/2$, in perfect
agreement with data from simulations.

For the exponent $\gamma$ the procedure is analogous, but we have to use the linear
behavior of the success rate and the relation $a = 1/2$ we have just obtained. The
equation for $N_w(t)$ now can be written as:

$$\frac{dN_w(t)}{dt} \propto \frac{1}{cN^{1/2}}\left(1 - \frac{ct}{N^{2}}\right) - \frac{1}{cN^{1/2}}\,\frac{ct}{N^{2}}\,2cN^{1/2}. \qquad (2)$$

If we impose $dN_w(t)/dt = 0$, we find that the time of the maximum has to scale
with the right exponent $\gamma = 3/2$ in the thermodynamic limit.
Fig. 3. Scaling with the population size N. In the upper graph the scaling of the
peak and convergence times, $t_{max}$ and $t_{conv}$, is reported, along with their difference, $t_{diff}$.
All curves scale with the power law $N^{1.5}$. Note that the $t_{conv}$ and $t_{diff}$ scaling curves present
characteristic log-periodic oscillations (see Sec. 3.2). The lower curve shows that the
maximum number of words (peak height, $N_w^{max} = N_w(t_{max})$) obeys the same power law
scaling.

The exponent for the convergence time, $\beta$, deserves a more articulate discussion,
and we can only provide a more naive argument, though one well supported by
evidence from numerical simulations. We concentrate on the scaling of the interval of
time separating the peak of $N_w(t)$ and the convergence, i.e., $t_{diff} = (t_{conv} - t_{max}) \sim
N^{\delta} \sim N^{1.5}$, since we already have an argument for the time of the peak of the total
number of words, $t_{max}$. $t_{diff}$ is the time span required by the system to get rid of
all the words but the one which survives in the final state. With the problem cast in
this way, we argue that a crucial parameter is the maximum number of words
the system stores at the beginning of the elimination phase.

If we adopt the mean field assumption that at $t = t_{max}$ each agent has on
average $N_w^{max}/N \sim \sqrt{N}$ words (see [28] for a detailed discussion of such a mean field
approximation), we see that, by definition, in the interval $t_{diff}$ each agent must
have won at least once. This is a necessary condition to have convergence, and it is
interesting to investigate the timescale over which this happens. Denoting by $\mathcal{N}(t)$
the number of agents who have not yet had a successful interaction at time $t$, we
have:

$$\mathcal{N}(t) = N\,(1 - p_s p_w)^{t}, \qquad (3)$$
Fig. 4. Evidence supporting the argument for the $\beta$ exponent. Top: $v(t)$ is the
(non-normalized) histogram of the times at which agents play their first successful
interaction, while $V(t)$ is the cumulative curve. It is clear that up to a time very close to
convergence there are still agents that have never won. Thus, the investigation of the first
time at which $V(t) = 1$ provides a good estimate of $t_{conv}$. Data refer to a single run for
a population of $N = 10^5$ agents. The $N_d(t)$ curve is also plotted, for reference, while the
vertical dashed grey line indicates the convergence time. Bottom: scaling of $t_{diff}$ with $N$ for
a system in which, at the beginning of the process, half of the population knows word A
and the other half word B. Thus, $N_d(t = 0) = 2$ and invention is eliminated. Experimental
points are well fitted by $t_{diff} \sim N \log N$, as predicted by our argument (see text). A fit of
the form $t_{diff} \sim N^{\delta}$, on the other hand, turns out to be less accurate (data not shown).



where $p_s = 1/N$ is the probability to randomly select an agent and $p_w = S(t)$ is
the probability of a success. The latter is $O(1/N^{0.5})$ at $t_{max}$, and stays around that
value for quite a long time span afterward. Indeed, as we have seen, the success
rate $S(t)$ grows linearly till the peak, where $S(t) = c\,t_{max}/N^{2} \sim 1/N^{0.5}$, and
exhibits a bending afterward, before the final jump to $S(t) = 1$ (Fig. 2). If we insert
the estimates of $p_s$ and $p_w$ in eq. (3), and we require the number of agents who
have not yet had a successful interaction to be finite just before the convergence,
i.e., $\mathcal{N}(t_{conv}) \sim O(1)$, we obtain $t_{diff} \sim N^{3/2} \log N$. Thus, the leading term of
the difference time, $t_{diff} \sim N^{3/2}$, is correctly recovered, and the necessary condition
$\mathcal{N}(t_{conv}) \sim O(1)$ turns out to be also sufficient. The possible presence of the
logarithmic correction, on the other hand, cannot be appreciated in simulations, due
also to the logarithmic oscillations in the $t_{diff}$ curve (see the following Sec. 3.2). Finally, it
is worth noting that the $S(t) \sim 1/N^{0.5}$ behavior can be understood also assuming
that at the peak of $N_w(t)$ each agent has $O(N^{0.5})$ words (mean field assumption),
and that the average number of words in common between two inventories is $O(1)$
Fig. 5. Log-periodic oscillations for convergence times. Rescaled values of $t_{conv}$
and $t_{max}$ are plotted along with their ratio. The rescaled convergence times exhibit global
oscillations that are well fitted by the function $t \propto \sin(c + c' \ln N)$, where $c$ and $c'$ are
constants whose values are $c \approx 1.0$ and $c' \approx 0.4$.



(as confirmed by numerical simulations shown in Fig. 12).

We can test the hypothesis behind the above argument in two ways. First of all,
we can investigate the distribution $v(t)$ of the times at which agents perform their
first successful interaction. Remarkably, Fig. 4 (top) shows that this distribution
extends approximately up to $t_{conv}$, so that the time $t^{*}$ at which $V(t) = \sum_{t' \le t} v(t') = 1$
turns out to provide a good estimate for $t_{conv}$. Then, we can validate our approach
by studying a controlled case. Consider a simplified situation in which each agent starts
the usual Naming Game knowing one of only two possible words, say A and B.
Invention is then prevented, and for the peak of $N_w(t)$ it holds $N_w^{max} \sim N$. Noting
that in this case we have $S(t_{max}) \sim O(1)$, and substituting this value in eq. (3), we
obtain $t_{diff} \sim N \log N$. Indeed, this prediction is confirmed by simulations, including
the logarithmic correction (Fig. 4 (bottom)), so that our approach
is supported by a second validation.
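
The estimate obtained from eq. (3) can be checked numerically. The sketch below solves $N(1 - p_s p_w)^t = 1$ for $t$ with $p_s = 1/N$; taking the $O(1)$ threshold to be exactly 1, and treating the success probability as constant over the whole interval, are simplifying assumptions of this example, and the function name is hypothetical.

```python
import math


def time_until_all_have_won(n_agents, success_prob):
    """Solve n_agents * (1 - p_s * p_w)**t = 1 for t, with p_s = 1 / n_agents.

    success_prob plays the role of p_w and is assumed constant in time.
    """
    p = success_prob / n_agents
    # log1p(-p) = log(1 - p), accurate for the small p that arise here.
    return math.log(n_agents) / -math.log1p(-p)
```

With `success_prob = n_agents ** -0.5` the result approaches $N^{3/2} \ln N$, while a size-independent `success_prob` (the two-word controlled case) gives a time proportional to $N \ln N$.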



3.2. Rescaling curves 

Since we know that the characteristic time required by the system to reach
convergence scales as $N^{1.5}$, we would expect a transformation of the form $t \to t/N^{3/2}$
to yield a collapse of the global-quantity curves, such as $S(t)$ or $N_w(t)$, relative to
systems of different sizes. However, this does not happen.

The first reason is that the curve of the scaling of the convergence time with $N$
Fig. 6. Rescaling of the success rate curves. Curves relative to different system sizes
show different qualitative behavior if time is rescaled as $t \to t/t_{S(t)=0.5} - 1$, where
$t_{S(t)=0.5} \sim N^{3/2}$. Indeed, on this timescale, the transition between the initial disordered
state and the final ordered one where $S(t) \approx 1$ (i.e., the disorder-order transition) becomes
steeper and steeper as $N$ grows.

does follow an $N^{3/2}$ trend, but presents a peculiar, seemingly oscillatory, behavior in
logarithmic scale. This is already visible from Figure 3 but is clearer in Figure 5,
where it is shown that the curve $t_{conv}/N^{3/2}$ is well fitted by a function of the type
$t \propto \sin(c + c' \ln N)$, where $c$ and $c'$ are constants (see footnote a). The same figure also shows that
such oscillations are absent, or at least very reduced, in the curve of peak times,
$t_{max}$.

The deviations of the convergence time scaling curve from a pure power law
have the effect of scattering rescaled curves, thus preventing any possible collapse.
An easy solution to this problem is that of rescaling according to intrinsic features
of each curve. In Figure 6 we have rescaled the success rate $S(t)$ curves following the
transformation $t \to t/t_{S(t)=0.5} - 1$, where $t_{S(t)=0.5}$ is the time at which the considered
curve reaches the value 0.5 (with $t_{S(t)=0.5} \sim N^{1.5}$, not shown). Interestingly, we
note that the curves still do not collapse. In particular, the transition from a
disordered state in which there is almost no communication between agents ($S(t) \approx
0$) to the final ordered state in which most interactions are successful ($S(t) \approx 1$)
a. It must be noted that, since the supposed oscillations would occur on a logarithmic scale, it is
hard to obtain data able to confirm their actually oscillatory behavior. Thus, the fit proposed here
must be taken only as a possible suggestion of the true behavior of the irregularities of the
$t_{conv}$ scaling curve.
Fig. 7. Collapse of the success rate curves. The time rescaling transformation
$t \to (t - t_{S(t)=0.5})/t_{S(t)=0.5}^{5/6}$ makes the different $S(t)$ curves collapse. Since the time at which
the success rate is equal to 0.5 scales as $N^{3/2}$ (data not shown), the transformation is
equivalent to $t \to (t - aN^{3/2})/N^{5/4}$. The collapse shows that the disorder-order transition
between an initial disordered state in which $S(t) \approx 0$ and an ordered state in which $S(t) \approx 1$
happens on a new timescale $t \sim N^{\theta}$ with $\theta \approx 5/4$.



becomes steeper and steeper as $N$ becomes larger. In other words, it is clear that
the shape of the curves changes when we observe them on our rescaled timescale.

Figure 6 suggests that the disorder-order transitions happen on a new timescale
$t \sim N^{\theta}$ with $\theta < \beta$, so that $N^{\theta}/t_{conv} \to 0$ when $N \to \infty$ and the transition becomes
instantaneous, on the rescaled timescale, in the thermodynamic limit. Indeed this is
exactly the case and, as shown in Figure 7, the value $\theta \approx 5/4$ and the transformation
$t \to (t - aN^{3/2})/N^{5/4}$ produce a good collapse of the success rate curves relative to
different $N$. In the next section we shall show how the right value for $\theta$ can be derived
with scaling arguments after a deeper investigation of the model dynamics.
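
The rescaling that produces the collapse can be applied to simulation data as sketched below. The value of the constant `a` is not given in the text, so the default used here is a placeholder assumption, as are the function name and the keyword parameters.

```python
def rescale_time(t, n_agents, a=1.0, beta=1.5, theta=1.25):
    """Map a time t to (t - a * N**beta) / N**theta.

    beta is the exponent of the transition time and theta that of the
    transition width; a is a size-independent constant (placeholder value).
    """
    return (t - a * n_agents ** beta) / n_agents ** theta
```

Plotting $S(t)$ against `rescale_time(t, N)` for several population sizes, with `a` chosen so that the curves cross 0.5 at the origin, is one way to visualize whether the transition widths scale as $N^{5/4}$.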



4. The approach to convergence 
4.1. The domain of agents 

We have seen that agents at first accumulate a growing number of words and then,
as their interactions become more and more successful, reduce the size of their
inventories till the point at which all of them know the same unique word. More
quantitatively, the evolution in time of the fraction of agents $f_n$ with inventory
size $n$ is shown in Figure 8. The curves refer to a population of $N = 10^3$ agents
Fig. 8. Evolution in time of inventory sizes n (n = 1 ... 15). $f_n(t)$ is the fraction of
agents whose inventory size is $n$ at time $t$. In the right part, $f_n(t)$ decreases with increasing
$n$. The process ends with all agents having the same unique word in their inventory, so
that $f_1 = 1$. Curves obtained by averaging 500 simulation runs on a population of $N = 10^3$
agents.



and have been obtained averaging over several simulation runs. We see that the
process starts with a rapid decrease of $f_0$ and a concomitant increase of the fraction
of agents with larger inventories. After a while, however, successful interactions
produce a new growth in the fraction of agents with small values of $n$. The process
evolves until the point at which all agents have the same unique word and $f_1 = 1$.

Some of the initial-time regularities of the $f_n$ curves can be easily described
analytically. For instance, it is easy to write equations for the evolution of the
number of species as long as $S(t) = 0$. We have:

$$\frac{df_0}{dt} = -f_0, \qquad (4)$$

$$\frac{df_{n \geq 1}}{dt} = f_{n-1} - f_n. \qquad (5)$$

These trivial relations allow us to understand some features of the curves, like the
exponential decay of $f_0$, or the fact that, at early times, each $f_n$ ($n > 0$) crosses
the corresponding $f_{n-1}$ in correspondence with its maximum (as can be recovered
imposing $df_n/dt = 0$). However, generalizing eq. (5) is not easy, since, as the
dynamics proceeds, one should take into account the correlations among inventories
to estimate the probability of successful interactions, and the analytical solution of
our Naming Game model is still lacking.
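
For the success-free regime, eqs. (4) and (5) with all inventories initially empty ($f_0(0) = 1$) are solved by the Poisson form $f_n(t) = t^n e^{-t}/n!$, which makes the crossing property explicit: $f_n = f_{n-1}$ exactly at $t = n$, where $f_n$ is maximal. The sketch below compares this closed form with a simple Euler integration. The time unit is the one implied by eqs. (4) and (5) as written; how it maps to simulation steps is not fixed here, and the function names are hypothetical.

```python
import math


def poisson_fraction(n, t):
    """Closed-form solution f_n(t) = t**n * exp(-t) / n! for f_0(0) = 1."""
    return t ** n * math.exp(-t) / math.factorial(n)


def integrate_fractions(n_max, t_end, dt=1e-3):
    """Forward-Euler integration of df_0/dt = -f_0, df_n/dt = f_{n-1} - f_n."""
    f = [1.0] + [0.0] * n_max
    for _ in range(round(t_end / dt)):
        df = [-f[0]] + [f[n - 1] - f[n] for n in range(1, n_max + 1)]
        f = [x + dt * d for x, d in zip(f, df)]
    return f
```

The truncation at `n_max` does not affect the components that are kept, because each $f_n$ depends only on $f_{n-1}$ and itself.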
Fig. 9. Distribution $P_n$ of inventory sizes n. Curves obtained by a single simulation
run for a population of $N = 10^4$ agents, for which $t_{max} = 6.2 \times 10^5$ and $t_{conv} = 1.3 \times 10^6$
time steps. Close to convergence the distribution is well described by a power law $P_n \sim n^{-7/6}$.


More quantitative insights can be obtained looking at the distribution $P_n$ of
inventory sizes $n$ at fixed times, reported in Figure 9 for the case $N = 10^4$
(see the cited literature for a detailed discussion of the $P_n$ behavior in different temporal regions
and different topologies). We see that in early stages most agents tend to have
large inventories, thus determining a peak in the distribution. When agents start
to understand each other, however, the peak disappears and large $n$ values keep
decreasing. Interestingly, in correspondence with the jump of the success rate that
leads to convergence, the histogram can be described by a power law distribution:

$$P_n \sim n^{-\sigma}\, g(n/\sqrt{N}), \qquad (6)$$

with the cut-off function $g(x) = 1$ for $x \ll 1$ and $g(x) = 0$ for $x \gg 1$. Numerically
it turns out that $1 < \sigma < 3/2$. To be more precise, in Figure 9 it is shown that the
value $\sigma \approx 7/6$ allows a good fitting of $P_n$ at the transition, and from simulations
it turns out that this is true irrespective of the system size.

Finally, it is also worth mentioning that, well before the transition, a larger
number of words in the inventory of the hearer increases (linearly) the chances of
success in an interaction (data not shown). The number of words known by the
speaker, on the other hand, basically plays no role until the system is close to the
transition. Here, small inventories are likely to contain the most popular word, thus
yielding a higher probability of success.
Fig. 10. Distribution w(R) of words of rank R. The most popular word has rank
R = 1, the second R = 2, etc. The distribution follows a power law behavior $w(R) \sim R^{-\rho}$
with an exponent that varies in time, while for high ranks it is truncated at $R \approx N/2$. Close
to the disorder-order transition, however, the most diffused word abandons the distribution,
which keeps describing the less popular words. Data come from a single simulation run and
concern a population of $N = 10^4$ agents.



4.2. The domain of words 

While agents negotiate with each other, words compete to survive. In
Figure 10 the rank distribution of words at fixed times is reported. The most popular
word is given rank 1, the second one rank 2, and so on. The first part of the distribution
is well described by a power law function, with an exponent that decreases with
time. In proximity of the disorder-order transition, however, the most popular word
breaks the symmetry and abandons the power law behavior, which continues to
describe well the remaining words. More precisely, the global distribution for the
fraction of agents possessing the $R$-ranked word, $w(R)$, can be described as:



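$$w(R) = w(1)\,\delta_{R,1} + \frac{(1-\rho)\left(N_w/N - w(1)\right)}{(N/2)^{1-\rho} - 2^{1-\rho}}\; R^{-\rho}\, f\!\left(\frac{R}{N/2}\right), \qquad (7)$$

where $\delta$ is the Kronecker delta function ($\delta_{a,b} = 1$ iff $a = b$ and $\delta_{a,b} = 0$ if $a \neq b$),
$f$ is a cut-off function truncating the distribution at $R \approx N/2$,
and the normalization factors are derived imposing that $\int_{1}^{\infty} w(R)\,dR = N_w/N$ (see footnote b).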
On the other hand from equation ^ one gets, by a simple integration, the 



Footnote: we use integrals instead of discrete sums, an approximation valid in the limit of large systems.

It follows that $w(R)|_{R>1} \to 0$ as $N \to \infty$, so that, in the thermodynamic limit,
$w(1) \sim O(1)$, i.e., the number of players with the most popular word is a finite
fraction of the whole population.
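
The rank distribution w(R) can be measured directly from the agents' inventories. The following is a minimal Python sketch, not the authors' code; the function name `rank_distribution` and the representation of inventories as Python sets are assumptions made for illustration.

```python
from collections import Counter


def rank_distribution(inventories):
    """Return [w(1), w(2), ...]: the fraction of agents holding the R-ranked word.

    `inventories` is a list of sets of words, one set per agent (hypothetical layout).
    """
    n_agents = len(inventories)
    # Count how many agents know each word.
    counts = Counter(word for inventory in inventories for word in inventory)
    # most_common() sorts by decreasing popularity, i.e. by increasing rank R.
    return [count / n_agents for _, count in counts.most_common()]


example = [{"a", "b"}, {"a"}, {"a", "c"}, {"b"}]
print(rank_distribution(example))  # [0.75, 0.5, 0.25]
```

Summing the returned list gives the average inventory size $N_w/N$, which is the discrete counterpart of the normalization condition imposed on w(R).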

4.3. Network view - The disorder-order transition 

We now need a more precise description of the convergence process. A profitable
approach consists in mapping the agents onto the nodes of a network (see Figure 11).
Two agents are connected by a link each time that they both know the same word,
so that multiple links are allowed. For example, if m out of the n words known by
agent A are also present in the inventory of agent B, they will be connected by m
links. In the network, a word is represented by a fully connected sub-graph, i.e.,
by a clique, and the final coherent state corresponds to a fully connected network
with all pairs connected by only one link. When two players interact, a failure
determines the propagation of a word, while a success can result in the elimination
of a certain number of words competing with the one used. In the network view, as
shown in Figure 11, this translates into a clique that grows when one of its nodes
is represented by a speaker that takes part in a failure, and is diminished when one
(or two) of its nodes are involved in a successful interaction with a competing word.
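
The mapping from inventories to the multi-link network can be sketched in a few lines. This is an illustrative Python sketch under the assumption that inventories are stored as sets; the helper names `link_multiplicities` and `clique_sizes` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def link_multiplicities(inventories):
    """Map each pair of agents (i, j), i < j, to the number of words they share."""
    return {
        (i, j): len(inventories[i] & inventories[j])
        for i, j in combinations(range(len(inventories)), 2)
    }


def clique_sizes(inventories):
    """Map each word to the size of its clique, i.e. the number of agents knowing it."""
    sizes = {}
    for inventory in inventories:
        for word in inventory:
            sizes[word] = sizes.get(word, 0) + 1
    return sizes


# Three agents and the two words used as examples in Figure 11.
agents = [{"WABAKU", "VALEM"}, {"WABAKU"}, {"VALEM"}]
print(link_multiplicities(agents))  # {(0, 1): 1, (0, 2): 1, (1, 2): 0}
```

In the final coherent state every pair shares exactly one word, so all multiplicities equal 1 and a single clique spans the whole population.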

To understand why the disorder-order transition becomes steeper and steeper,
if observed on the right timescale, we must investigate the dynamics that leads
to convergence. If we make the hypothesis that, when N is large, just before the
transition all the agents have the word that will dominate, the problem reduces to
the study of the rate at which competing words disappear. In other words, the
crucial information is how the number of deleted links in the network, $M_d$, scales
with N. It holds:

$$M_d = \frac{N_w}{N} \int_2^{\infty} w^2(R)\, N \, dR \qquad (9)$$

where $N_w/N$ is the average number of words known by each agent, w(R) is the
probability of having a word of rank R, and w(R)N is the number of agents that have
that word (i.e., the size of the clique). On the other hand, considering the network
structure, eq. (9) is the product of the average number of cliques involved in each
deletion process [$N_w/N$], multiplied by an integral stating, in probability, which clique
is involved [w(R)] and which is its size [w(R)N]. The integral on R starts from the
first deletable word, i.e., the second most popular, because of the assumption that
all the successes are due to the use of the most popular word.

In our case, for $\alpha \approx 7/6$, we obtain that $M_d \sim N^{5/4}$. Thus, from equation (9),
we have that the ratio $M_d/N^{3/2} \sim N^{-\frac{3}{2}(\alpha-1)}$ goes to zero for large systems (since
$\alpha \approx 7/6$, and in general $\alpha > 1$), and this explains the greater slope, on the system
timescale, of the success rate curves for large populations (Figure U}. 
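
As a quick arithmetic check of the exponents quoted above, the two ways of writing the ratio can be compared. This is only a worked substitution using the values $\alpha \approx 7/6$ and $M_d \sim N^{5/4}$ stated in the text:

```latex
\frac{M_d}{N^{3/2}} \sim \frac{N^{5/4}}{N^{3/2}} = N^{-1/4},
\qquad
-\tfrac{3}{2}\,(\alpha - 1)\Big|_{\alpha = 7/6}
  = -\tfrac{3}{2}\cdot\tfrac{1}{6}
  = -\tfrac{1}{4}.
```

Both routes give the exponent $-1/4$, so the ratio decays, slowly, as the population grows.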

Fig. 11. Agents network dynamics. Top Left: a link between two agents (i.e., nodes)
exists every time they have a word in common in their inventories, so that multiple links are
allowed. In this representation, a word corresponds to a fully connected (sub)set of agents,
i.e., a clique; in the figure, the two cliques corresponding to words WABAKU and VALEM
are highlighted. Top Right: the two highlighted agents have just failed to communicate,
so that the word VALEM has been transmitted to the agent placed at the top of the
graphical representation. This agent therefore enters the enlarged clique corresponding to
the transmitted word VALEM. Bottom: the two highlighted agents have just succeeded using
the word VALEM. The clique corresponding to the used word does not change in any respect,
but the competing cliques (here that of WABAKU) are reduced.

4.4. The overlap functional

We have looked at all the timescales involved in the process leading the population
to the final agreement state. Yet, we have not investigated whether this convergence
state is always reached. This is indeed the case, and simple considerations
clarify this point. First of all, it must be noticed that, according to the interaction
rules of the agents, the agreement condition constitutes the only possible absorbing
state of our model. The proof that convergence is always reached is then
straightforward. Indeed, from any possible state there is always a non-zero probability to
reach an absorbing state in, for instance, 2(N - 1) interactions. For example, a
possible sequence is as follows. A given agent speaks twice with each of the other (N - 1)
agents, always using the same word A. After these 2(N - 1) interactions all the
agents have only the word A. Denoting with p the probability of this sequence of
2(N - 1) steps, the probability that the system has not reached an absorbing state
after 2(N - 1) iterations is smaller than or equal to (1 - p). Therefore, iterating this
procedure, the probability that, starting from any state, the system has not reached an
absorbing state after 2k(N - 1) iterations is at most $(1 - p)^k$, which vanishes
exponentially with k. The above argument, though very simple and general, is
exact. However, another perspective to address the problem of convergence consists
in monitoring the lexical coherence of the system. To this purpose, we introduce
the overlap functional O:

$$O(t) = \frac{2}{N(N-1)} \sum_{i>j} \frac{|a_i \cap a_j|}{k_i k_j} \qquad (10)$$

where $a_i$ is the $i$-th agent's inventory, whose size is $k_i$, and $|a_i \cap a_j|$ is the number
of words in common between $a_i$ and $a_j$. The overlap functional is a measure of the
lexical coherence in the system and it is bounded, $O(t) \le 1$. At the beginning of
the process it is equal to zero, O(t = 0) = 0, while at convergence it reaches its
maximum, $O(t = t_{conv}) = 1$.

From extensive numerical investigations it turns out that, averaged over several
runs, the functional always grows, i.e., $\langle O(t+1) \rangle > \langle O(t) \rangle$ (see Figure 12). Moreover,
looking at the single realization, this function grows almost always, i.e., $\langle O(t+1) \rangle > O(t)$,
except for a set of very rare configurations whose statistical weight appears to
be negligible (data not shown). Even if it is not a proof in a rigorous sense, this
monotonicity, combined with the fact that the functional is bounded, gives a strong
indication that the system will indeed converge.
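
The absorbing sequence used in the convergence argument of this section can be replayed explicitly. The sketch below assumes the standard minimal Naming Game update (on failure the hearer adds the word, on success both agents keep only that word); the function names `interact` and `force_consensus` are hypothetical and chosen for illustration.

```python
def interact(speaker_inv, hearer_inv, word):
    """Apply one interaction in which the speaker utters `word`; return True on success."""
    if word in hearer_inv:
        # Success: both agents delete every competing word.
        speaker_inv.clear()
        speaker_inv.add(word)
        hearer_inv.clear()
        hearer_inv.add(word)
        return True
    # Failure: the hearer learns the word.
    hearer_inv.add(word)
    return False


def force_consensus(inventories, speaker, word):
    """Let `speaker` utter `word` twice to every other agent; return the number of steps."""
    assert word in inventories[speaker]
    steps = 0
    for hearer in range(len(inventories)):
        if hearer == speaker:
            continue
        for _ in range(2):
            interact(inventories[speaker], inventories[hearer], word)
            steps += 1
    return steps


population = [{"A", "B"}, {"C"}, {"B", "D"}, set()]
print(force_consensus(population, speaker=0, word="A"))  # 6, i.e. 2(N - 1) with N = 4
print(population)  # [{'A'}, {'A'}, {'A'}, {'A'}]
```

Whatever the starting inventories, the second utterance to each hearer is necessarily a success, which is why 2(N - 1) steps always suffice for this particular sequence.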

It is also interesting to note that eq. (10) is very similar to the expression for
the success rate S(t), which can formally be written as:

$$S(t) = \frac{1}{N(N-1)} \sum_{i>j} \left( \frac{|a_i \cap a_j|}{k_i} + \frac{|a_i \cap a_j|}{k_j} \right) \qquad (11)$$

where the intersection between two inventories is divided only by the inventory
size of the speaker. Figure 12 shows that these two quantities exhibit a very similar
behavior. However, while the overlap functional is equal to 1 only at convergence,
this is not true for the success rate: if all agents had identical inventories of size
n > 1 we would have S(t) = 1 and O(t) = 1/n. For this reason the success rate is
not a suitable functional to prove convergence.

Fig. 12. Overlap functional O(t). Top: the time evolution of the overlap
functional, averaged over 1000 simulation runs (for a population of $N = 10^3$ agents). Curves
for the success rate, S(t), and the average intersection between inventories, I(t), are also
included. By definition, $O(t) \le 1$. It is evident that $\langle O(t+1) \rangle > \langle O(t) \rangle$ holds, which,
along with the stronger $\langle O(t+1) \rangle > O(t)$, valid for almost all configurations (not shown),
indicates that the system will reach the final state of convergence, where O(t) = 1. Bottom:
the total number of words $N_w(t)$ is plotted for reference.

Finally, in Fig. 12 we have also plotted the average intersection between
inventories, i.e.

$$I(t) = \frac{2}{N(N-1)} \sum_{i>j} |a_i \cap a_j|. \qquad (12)$$

Remarkably, it turns out that I(t) < 1 during the whole process, even if in principle
this quantity is not bounded.
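
The three pairwise quantities discussed in this section can be evaluated on any snapshot of the population. The sketch below is an illustration, not the authors' code. It assumes that O, S and I are pair averages of $|a_i \cap a_j|/(k_i k_j)$, of the speaker-normalized intersection, and of $|a_i \cap a_j|$ respectively, that every inventory is non-empty, and that inventories are Python sets; `coherence_measures` is a hypothetical name.

```python
from itertools import combinations


def coherence_measures(inventories):
    """Return (O, S, I) for a list of non-empty inventories (sets of words)."""
    n = len(inventories)
    n_pairs = n * (n - 1) / 2
    overlap = success = intersection = 0.0
    for a_i, a_j in combinations(inventories, 2):
        common = len(a_i & a_j)
        k_i, k_j = len(a_i), len(a_j)
        overlap += common / (k_i * k_j)
        # Either agent of the pair may act as speaker, hence the two terms.
        success += common / k_i + common / k_j
        intersection += common
    return overlap / n_pairs, success / (2 * n_pairs), intersection / n_pairs


# Identical inventories of size n = 3: S = 1 although O = 1/3, as noted in the text.
print(coherence_measures([{"x", "y", "z"} for _ in range(4)]))
```

At consensus (every inventory reduced to the same single word) all three quantities equal 1.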

5. Single games 

We know that single realizations have a quite irregular behavior and can deviate
significantly from average curves (Fig. 2). It is therefore interesting to investigate to
what extent average times and curves provide a good description of single processes.

In Figure 13 (top) we have plotted the distribution of peak times for a population
of $N = 10^3$ agents. It is clear that the data cannot be fitted by a Gaussian distribution.

Fig. 13. Peak and convergence time distributions. Top: the distribution of the peak
time clearly deviates from Gauss behavior. Bottom: the cumulative distribution of
the convergence times $t_{conv}$ is well fitted by a Weibull distribution D(t) (see eq. (13)),
with fit parameters $g_0 \approx 4.9 \times 10^4$, $g_1 \approx 7.9 \times 10^{[\ldots]}$ and $g_2 \approx 9.6 \times 10^{[\ldots]}$.
The same function describes well also the peak time distribution (data not shown). Data
refer to a population of $N = 10^3$ agents and are the result of $10^6$ simulation runs.



The same peculiar behavior is also shown by the distribution of the convergence
times (Figure 13, bottom) and by that of the intervals between the time of the
maximum number of words and the time of convergence (data not shown). Thus, the
non-Gaussian behavior appears to be an intrinsic feature of the model. In fact, as
shown in Figure 13 (bottom) for the convergence times, all these distributions turn
out to be well fitted (in their cumulative form) by an extreme value distribution:

D(t) = exp (13) 

where $g_0$, $g_1$ and $g_2$ are fit parameters [31, 32].

Extreme value distributions originated from the study of the distribution of the
maximum (or minimum) in a large set of independent and identically distributed
variables [31, 32]. It turns out, however, that a generalization of these functions
including a continuous shape parameter a, known as Gumbel distribution $G_a(x)$,
has been observed in many models ranging from turbulence and equilibrium critical
systems to non-equilibrium models related to self-organized criticality [...], to 1/f
noise [...] and many other systems (see [...] and references therein). The Naming
Game model provides another example.
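
The classical origin of extreme value distributions mentioned above can be illustrated numerically. The sketch below is a textbook example and not the fit function of eq. (13): it compares the exact cumulative distribution of the maximum of n independent Exp(1) variables, shifted by ln n, with the standard Gumbel form exp(-exp(-x)). The function names are chosen here for illustration.

```python
import math


def cdf_max_exponential(x, n):
    """Exact CDF of the maximum of n i.i.d. Exp(1) variables, evaluated at x + ln(n)."""
    return (1.0 - math.exp(-(x + math.log(n)))) ** n


def gumbel_cdf(x):
    """Standard Gumbel cumulative distribution."""
    return math.exp(-math.exp(-x))


for x in (-1.0, 0.0, 1.0, 3.0):
    # The two columns agree more and more closely as n grows.
    print(x, cdf_max_exponential(x, 10**6), gumbel_cdf(x))
```

The agreement improves as n increases, which is the sense in which the Gumbel form describes the maximum of a large set of independent and identically distributed variables.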

It must be noted, however, that there is no obvious theoretical explanation of
the fact that extreme-value-like distributions are also found in the study of the
fluctuations of global quantities. Yet, in many cases, these distributions are used
simply as convenient fitting functions. Interestingly, it was recently shown that
there is a connection between Gumbel functions and the statistics of global
quantities expressed as sums of non-identically distributed random variables, without
the need of invoking extremal processes [...]. We can therefore argue that there is
not necessarily a hidden extreme value problem in our model. In any case, a more
rigorous explanation of the presence of Gumbel-like distributions is left for future
work.

Fig. 14. Single agents convergence time distribution. We define the convergence
time of a single agent as the last time at which it had to delete words after a successful
interaction; $f_{conv}(t)$ is the fraction of agents who reach convergence at time t. Top:
distributions coming from 10 simulation runs are plotted. It is clear that distributions
coming from different runs can be non-overlapping, i.e., that the distance between the
peaks of single curves can be much larger than the average width of the same curves (which
does not exhibit any strong dependence on the single run). Bottom: a single distribution
is analyzed, showing that it cannot be described by a Gauss distribution. The last agent
to converge determines the global convergence time. Curves are relative to a population
of $N = 10^5$ agents.

In Figure 14 (top) we show 10 single-run distributions of convergence times. Each
curve illustrates the fraction of agents that converged at a given time in that run,
$f_{conv}(t)$. We consider the single agent convergence time as the last time at which it
had to delete words after a successful interaction. From the figure it is clear that the
separation between the peaks of two different distributions can be much larger than
the average width of a single curve. In other words, we see that the first moment
of the distributions strongly depends on the single realization, while the second one
does not. This information is crucial to interpret the curves shown in Figure 13
correctly. In fact, we now know that they are indeed representative of fluctuations
occurring among different runs, and do not simply describe the behavior of the last
converging agent in a scenario in which most agents always converge, on average,
at the same run-independent time. In Figure 14 (bottom) it is shown that single-run
curves also deviate from Gauss behavior, showing long tails for large times.

Fig. 15. Correlation between peak and convergence times ($\tau_{max}$ and $\tau_{conv}$,
respectively). Each run is represented by a point in the scatter plot. The dashed line is
$\tau_{conv} = \tau_{max}$ and therefore no points can lie below it. The average times $t_{conv}$ and $t_{max}$
are also shown with a lighter (yellow) point at the center of the distribution (statistical
errors are not visible on the scale of the graph).

Given these distributions of convergence and peak times, and also that their
difference, $t_{diff}$, behaves in the same way, it is interesting to investigate whether
there is any correlation between these two times. In Figure 15 we present a scatter
plot in which the axes indicate $\tau_{conv}$ and $\tau_{max}$, respectively the convergence and
peak times for a single run (so that $t_{max} = \langle \tau_{max} \rangle$ and $t_{conv} = \langle \tau_{conv} \rangle$). It is clear
that the correlation between these two times is very weak. Indeed, the knowledge
of $\tau_{max}$ does not allow one to make any sharp prediction on when the population will
reach convergence in the considered run.
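
Single-run values of the peak and convergence times can be collected with a short simulation. The sketch below assumes the standard minimal Naming Game rules on a fully connected population (random speaker and hearer, invention when the speaker's inventory is empty, word added on failure, inventories collapsed on success). Time is counted in single interactions, and `run_naming_game` and its parameters are hypothetical names, not the authors' code.

```python
import random


def run_naming_game(n_agents, seed, max_steps=10**6):
    """Simulate one run and return (tau_max, tau_conv) in units of interactions."""
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    next_word = 0                      # words are labeled by invention order
    total_words = 0                    # N_w(t): sum of all inventory sizes
    peak_words, tau_max = 0, 0
    for t in range(1, max_steps + 1):
        speaker, hearer = rng.sample(range(n_agents), 2)
        s_inv, h_inv = inventories[speaker], inventories[hearer]
        if not s_inv:                  # empty inventory: invent a brand new word
            s_inv.add(next_word)
            next_word += 1
            total_words += 1
        word = rng.choice(sorted(s_inv))
        if word in h_inv:              # success: both keep only the uttered word
            total_words -= len(s_inv) + len(h_inv) - 2
            s_inv.clear()
            s_inv.add(word)
            h_inv.clear()
            h_inv.add(word)
        else:                          # failure: the hearer learns the word
            h_inv.add(word)
            total_words += 1
        if total_words > peak_words:   # first time the maximum of N_w is reached
            peak_words, tau_max = total_words, t
        if total_words == n_agents and all(inv == s_inv for inv in inventories):
            return tau_max, t
    raise RuntimeError("no convergence within max_steps")


# One (tau_max, tau_conv) pair per run, i.e. one point of a scatter plot.
print([run_naming_game(50, seed) for seed in range(5)])
```

By construction every pair satisfies tau_conv >= tau_max, which corresponds to the dashed line of Figure 15.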

Finally, Figure 16 shows that the relative standard deviation of all the relevant
global quantities ($t_{max}$, $t_{diff}$, $t_{conv}$ and $N_w^{max}$) decreases slowly as the system size
N grows. In general, if the ratio $\sigma(x)/\langle x \rangle$ goes to zero as N increases, the system is
said to exhibit self-averaging, and this seems to be the case for the Naming Game.
However, it is difficult to draw a definitive conclusion, due to the large amount of
time needed to perform a significant number of simulation runs for large values of
N. The system seems to show self-averaging as far as the peak height and time are
concerned, but this does not seem to be the case for the time of convergence. In any
case, it is worth mentioning that Lu, Korniss and Szymanski [...] conclude that a
slightly modified version of the Naming Game model does not display self-averaging
when the population is embedded in random geometric networks.

Fig. 16. Scaling of the relative standard deviation $\sigma(x)/\langle x \rangle$. The ratio between the
standard deviation $\sigma$ and the corresponding (average) quantity is plotted as a function of
the system size. In all cases the ratio decreases slightly, or stays constant, as the population
size N grows. In particular, the decrease is more evident for $N_w^{max}$ and $t_{max}$, while the $t_{conv}$
and $t_{diff}$ curves are almost constant for large N. However, data from our simulations are
not sufficient to conclude whether the Naming Game exhibits self-averaging. The standard
deviation of x is defined as $\sigma(x) = \sqrt{\frac{1}{N_{runs}-1} \sum_{i=1}^{N_{runs}} (x_i - \langle x \rangle)^2}$, where $x_i$ is the measured
value, $\langle x \rangle$ is the average value, and $N_{runs}$ is the number of simulation runs (here, $N_{runs} = 1000$).
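
The self-averaging indicator is straightforward to compute from a set of run outcomes. The following is a minimal sketch using the sample standard deviation with the $N_{runs} - 1$ normalization; `relative_std` is a hypothetical helper name.

```python
import math


def relative_std(values):
    """Return sigma(x) / <x>, with sigma computed using the (N_runs - 1) normalization."""
    n_runs = len(values)
    mean = sum(values) / n_runs
    variance = sum((x - mean) ** 2 for x in values) / (n_runs - 1)
    return math.sqrt(variance) / mean


# Toy data standing in for, e.g., convergence times measured in different runs.
print(relative_std([2, 4, 4, 4, 5, 5, 7, 9]))
```

Evaluating this ratio for increasing N, on many runs per size, is what a curve of the kind shown in Figure 16 requires.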

6. Convergence Word 

As we have seen, the negotiation process leading agents to convergence can be seen
as a competition process among different words. Only one of them will survive in
the final state of the system. It is therefore interesting to ask whether it is possible
to predict, to some extent, which word is going to dominate.

According to the Naming Game dynamical rules, the only parameter that makes
single words distinguishable is their creation time. Thus, it seems natural to
investigate whether the moment in which a word is invented can affect its chances of
surviving. It turns out that this is indeed the case, as shown in Figure 17. The
upper graph plots the probability for a word to become the dominating one as a
function of its normalized creation position. This means that each word is identified
by its creation order: the first invented word is labeled as 1, the second as 2 and so
on. To normalize the labels, they are then divided by the label of the last invented word.
From the figure it is clear that early invented words have higher chances of survival. The
advantage can be better quantified if we plot the winning probability of a word
as a function of its invention time, as is done in the bottom graph of Figure 17.
We find that data from simulations are well fitted by an exponential distribution of
the form $W = (1/\tau)\exp(-t/\tau)$, indicating that the advantage of early invention is
indeed quite strong.

Fig. 17. Word survival probability. Top: the probability that a given word becomes
the dominating one (i.e., the only one to survive when the system reaches the convergence
state) is plotted as a function of its normalized invention position (see text for details).
Early invention is clearly an advantageous factor. Bottom: the survival probability is now
plotted as a function of the invention time of words. The experimental distribution can be
fitted by an exponential of the form $W \sim (1/\tau)\exp(-t/\tau)$, with $\tau \approx 150$. In both graphs,
data have been obtained by $10^5$ simulation runs of a population made of $N = 10^3$ agents.
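
To get a feeling for the strength of this advantage, the fitted form can be integrated. The following is a worked illustration only: it assumes that the fit can be read as a normalized density over invention times $t \ge 0$ (an assumption made here, not a statement from the paper) and uses the fitted value $\tau \approx 150$ quoted for Figure 17.

```latex
P(\text{winner invented before } T)
  = \int_0^{T} \frac{1}{\tau}\, e^{-t/\tau}\, dt
  = 1 - e^{-T/\tau},
\qquad
T = \tau \approx 150:\ 1 - e^{-1} \approx 0.63,
\qquad
T = 3\tau \approx 450:\ 1 - e^{-3} \approx 0.95 .
```

Under this reading, roughly two thirds of the winning words would be invented within one characteristic time $\tau$.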

Finally, an interesting question concerns the behavior of the winning probability
distribution as a function of the system size N. In Figure 18 we show the
distributions as a function of the normalized labels described above for three different system
sizes, $N = 10^2$, $N = 10^3$ and $N = 10^4$. The advantage of earlier creation increases
with the system size, but our data do not allow clear predictions about the behavior
of the distribution in the thermodynamic limit, $N \to \infty$. We might speculate that
the distribution collapses into a Dirac delta on the first invented word [...].

Fig. 18. Role of the system size on the distribution of the winning word. The
advantage of early invention increases in larger populations.



7. Symmetry Breaking - A controlled case 

In the previous sections we have seen that the winning word is chosen by a symmetry
breaking process (section 4.2). This is true even if, as we have seen in section 6,
early invention increases the probability for a word to impose itself. Indeed, if we
start with an artificial configuration in which each agent has a different word in its
inventory, i.e., if we remove the influences of the invention process, the process still
ends up in the usual agreement state (data not shown).

In particular, we can concentrate on the case in which there are only two words
at the beginning of the process [...], say A and B, so that the population can be
divided into three classes: the fraction of agents with only A, $n_A$, the fraction of
those with only the word B, $n_B$, and finally the fraction of agents with both words,
$n_{AB}$ (see also [...] for a similar model). Describing the time evolution of the three
species is straightforward:



$$
\begin{aligned}
dn_A/dt &= -n_A n_B + n_{AB}^2 + n_A n_{AB} \\
dn_B/dt &= -n_A n_B + n_{AB}^2 + n_B n_{AB} \qquad\qquad (14) \\
dn_{AB}/dt &= +2 n_A n_B - 2 n_{AB}^2 - (n_A + n_B)\, n_{AB}
\end{aligned}
$$

The meaning of the different terms of the equations is clear. For instance, for
$dn_A/dt$ we have that $-n_A n_B$ considers the case in which an agent with the word B
transmits it to an agent with the word A, $n_{AB}^2$ takes into account the probability
that two more agents with only the A word are created if two agents with both
words happen to have a success with A, and $n_A n_{AB}$ is due to the probability that
an agent with only A has a success speaking to an agent with both A and B.

Fig. 19. Resistance against invasion. Two populations that converged separately on
conventions A and B merge. The figure plots the probability $S(n_A)$ that convention A
becomes the final accepted convention of the new population, versus the normalized size $n_A$
(where $n_A + n_B = 1$) of the original population of A spreaders. As the total population size
increases, the probability for the initially less widespread convention to impose itself
decreases, as predicted by equations (14).

The system of differential equations (14) is deterministic. It has three fixed
points in which the system can collapse depending on initial conditions. If
$n_A(t = 0) > n_B(t = 0)$ [$n_B(t = 0) > n_A(t = 0)$] then at the end of the evolution we
will have the stable fixed point $n_A = 1$ [$n_B = 1$] and, obviously, $n_B = n_{AB} = 0$
[$n_A = n_{AB} = 0$]. If, on the other hand, we start from $n_A(t = 0) = n_B(t = 0)$, then
the equations lead to $n_A = n_B = 2n_{AB} = b$, with $b \simeq 0.18$. The latter situation
is clearly unstable, since any external perturbation would make the system fall into
one of the two stable fixed points. Indeed, it is never observed in simulations due
to stochastic fluctuations that in all cases determine a symmetry breaking forcing
a single word to prevail.
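
The deterministic behavior described above can be explored by integrating the three rate equations numerically. The sketch below uses a fixed-step fourth-order Runge-Kutta scheme written with the standard library only. The step size, the integration length and the function names are assumptions made for illustration, and the right-hand sides are those of eq. (14) with the squared $n_{AB}$ terms.

```python
def derivatives(n_a, n_b, n_ab):
    """Right-hand sides of the two-word mean-field equations."""
    dn_a = -n_a * n_b + n_ab ** 2 + n_a * n_ab
    dn_b = -n_a * n_b + n_ab ** 2 + n_b * n_ab
    dn_ab = 2 * n_a * n_b - 2 * n_ab ** 2 - (n_a + n_b) * n_ab
    return dn_a, dn_b, dn_ab


def integrate(n_a, n_b, n_ab=0.0, dt=0.01, steps=20000):
    """Integrate the equations with classical RK4; return the final (n_A, n_B, n_AB)."""
    state = (n_a, n_b, n_ab)
    for _ in range(steps):
        k1 = derivatives(*state)
        k2 = derivatives(*(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = derivatives(*(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = derivatives(*(s + dt * k for s, k in zip(state, k3)))
        state = tuple(
            s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


print(integrate(0.55, 0.45))  # the initial majority A takes over
print(integrate(0.45, 0.55))  # the initial majority B takes over
```

The difference $n_A - n_B$ obeys $d(n_A - n_B)/dt = (n_A - n_B)\, n_{AB}$, so any initial imbalance is amplified, while a run started exactly at $n_A = n_B$ stays on the symmetric line in the absence of noise.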

Equations (14), however, are not only a useful example to clarify the nature of
the symmetry breaking process. In fact, they also describe the interaction between
two different populations that converged separately on two distinct conventions. In
this perspective, eq. (14) predicts that the larger population will impose
its convention. In the absence of fluctuations, this is true even if the difference is
very small: B will dominate if $n_B(t = 0) = 0.5 + \epsilon$ and $n_A(t = 0) = 0.5 - \epsilon$, for any
$0 < \epsilon < 0.5$ (we consider $n_{AB}(t = 0) = 0$). Figure 19 reports data from simulations
in which the probability of success of the convention of the minority group $n_A$,
$S(n_A)$, was monitored as a function of the fraction $n_A$ (where $n_A + n_B = 1$). The
absence of fluctuations is partly recovered as the total number of agents grows,
and in fact it turns out that, for any given $n_A < 0.5$, the probability of success
decreases as the system size is increased. Following eq. (14), in the thermodynamic
limit ($N \to \infty$) this probability goes to zero.
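
The role of fluctuations in finite populations can be probed with a small stochastic experiment of the kind summarized in Figure 19. The sketch below is illustrative and is not the authors' simulation code: it assumes the standard minimal Naming Game rules, uses small populations and few runs to keep the running time short, and the names `minority_wins` and `success_probability` are hypothetical.

```python
import random


def minority_wins(n_agents, n_a, rng):
    """Merge a group of n_a agents using A with n_agents - n_a agents using B.

    Return True if convention A is the one that survives.
    """
    inventories = [{"A"} for _ in range(n_a)] + [{"B"} for _ in range(n_agents - n_a)]
    known_by = {"A": n_a, "B": n_agents - n_a}   # agents knowing each word
    while known_by["A"] > 0 and known_by["B"] > 0:
        speaker, hearer = rng.sample(range(n_agents), 2)
        s_inv, h_inv = inventories[speaker], inventories[hearer]
        word = rng.choice(sorted(s_inv))
        if word in h_inv:                        # success: drop the competing word
            for inv in (s_inv, h_inv):
                for other in inv - {word}:
                    known_by[other] -= 1
                inv.intersection_update({word})
        else:                                    # failure: the hearer learns the word
            h_inv.add(word)
            known_by[word] += 1
    return known_by["A"] > 0


def success_probability(n_agents, fraction, runs, seed):
    """Estimate S(n_A): the fraction of runs in which convention A prevails."""
    rng = random.Random(seed)
    n_a = round(fraction * n_agents)
    return sum(minority_wins(n_agents, n_a, rng) for _ in range(runs)) / runs


for n in (20, 100):
    print(n, success_probability(n, 0.3, runs=200, seed=1))
```

With these toy parameters the estimate for the larger population is expected to be smaller than the one for the smaller population, in line with the trend described in the text; reliable curves would of course require far more runs and larger sizes.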



8. Discussion and Conclusions 

The Naming Game is the simplest model able to account for the emergence of
a shared set of conventions in a population of agents. Its main characteristics
are [...]:

• The negotiation dynamics between individuals: the interaction rules are 
asymmetric and feedback is an essential ingredient to reach consensus; 

• The memory of the agents: individuals can accumulate words, and only
after many interactions do they have to decide on the final word;

• The absence of bounds to the inventory size: the number of words is neither 
fixed nor limited. 

All these aspects derive from issues in Artificial Intelligence, namely to
understand how an open population of physically embodied autonomous robots could
self-organize communication systems grounded in the world [...]. The model is also
relevant for all cases in which a distributed group of agents has to tacitly negotiate
decisions, as in opinion spreading or market decisions [...]. Nevertheless, the ingredients
listed above are absent from most of the well-known opinion-dynamics models. In
the Axelrod model [...], for instance, each agent is endowed with a fixed-size vector of
opinions, while in the Sznajd model [...] or the Voter model [..., 43, 44] the opinion can
take only two discrete values, and an agent deterministically adopts the opinion of
one of its neighbors. Deffuant et al. [...] model the opinion as a unique variable and
the evolution of two interacting agents is deterministic, while in the Hegselmann
and Krause model [...] opinions evolve as an averaging process. Most of these models
include in some way the concept of bounded confidence, according to which two
individuals do not interact if their opinions are not close enough, something which is
entirely absent in the Naming Game. Interestingly, a recently proposed generalized
version of the Naming Game, in which a simple parameter rules the consolidation
behavior of the agents after a game, shows a non-equilibrium phase transition
in which the final state can be consensus (as in the model we have analyzed in
this paper), polarization (a finite number of conventions survives asymptotically)
or fragmentation (the final number of conventions scales with the system size),
thus showing some phenomena also found for most opinion dynamics models.
Compared to earlier Semiotic Dynamics models of the Naming Game [...], this
paper has made two contributions. First, the effort towards the definition of simple
interaction rules has helped to bring out the essential features needed to achieve a
consensus state. Remarkably, we have shown that the weights typically associated
with word-meaning pairs in all earlier Naming Game models are not crucial. The
simplification does not impinge on the ability of the model to be used on embodied
agents, i.e., it does not introduce a global observer or other forms of global
knowledge.

Next, because of the simplicity of the presented model, we have been able to
perform a comprehensive analysis of its behavior, which had never been done with
earlier models due to their complexity. We have investigated the basic features of
the process leading the population to converge, and how the crucial quantities scale
with system size. In this context, we have also revealed a hidden timescale that rules
the transition between the initial state, in which there is no communication among
agents, and the final one, in which there is global agreement. Then we have analyzed
several other aspects of the whole process, such as its properties of convergence, the
relation between single runs and averaged curves, and the different probabilities for
single words to impose themselves. We have also studied the elementary case in
which only two words are present in the system, which can be interpreted as the
merging of two converged populations and which clarifies the role of stochastic
fluctuations in the convergence process. Although many of these results have been seen
in numerical simulations, we have here been able to perform for the first time a
mathematical analysis. In future work, the techniques we have used will be applied
to more complex forms of communication, including grammatical language, for which
some Artificial Intelligence models already exist [...].